SIGNIFICANCE AND EXPLANATION


This paper proposes a procedure for testing non-nested families of hypotheses which substitutes raw computing power for asymptotic approximations. Given access to a modern computer, the procedure is practically universally applicable. Examples illustrating its small-sample properties are provided, as are theorems on its asymptotic behavior.

A PARAMETRIC BOOTSTRAP PROCEDURE FOR TESTING SEPARATE FAMILIES OF HYPOTHESES

Wei-Yin  Loh 

1. Introduction.

This  paper  considers  the  problem  of  finding  general  procedures  for 
testing  separate  families  of  hypotheses,  separate  in  the  sense  that  an 
arbitrary  member  in  the  null  cannot  be  obtained  as  a  limit  of  members  in 
the  alternative  hypothesis.  A  fairly  general  test  has  been  proposed  in 
two  well-known  papers  by  Cox  (1961,  1962).  It  is  based  on  the  property 
that,  subject  to  regularity  conditions,  the  normalised  logarithm  of  the 
ratio  of  maximized  likelihoods  behaves  asymptotically  like  a  standard 
normal  random  variable. 

Specifically, let $(X_1,\ldots,X_n)$ be a random sample from a distribution with density $h(x)$ and consider testing the separate families $H_0 : h(x) = f(x,\theta)$ vs. $H_1 : h(x) = g(x,\omega)$, where $(\theta,\omega)$ are unknown, possibly vector-valued, parameters. Let $\hat\theta$, $\hat\omega$ be the maximum likelihood estimates of $\theta$, $\omega$ respectively and

    $T_n = n^{-1} \sum_{i=1}^{n} \log\{g(X_i,\hat\omega)/f(X_i,\hat\theta)\}$.   (1.1)

Then, under $H_0$ and subject to appropriate regularity conditions (e.g. White, 1982),

    $Z_n = n^{1/2}(T_n - E_{\hat\theta}T_n) \to N(0, \sigma^2(\theta))$   (1.2)

in distribution, for some $\sigma^2(\theta) > 0$. In this paper $N(\mu,\sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$. Under $H_1$, $Z_n$ tends to $+\infty$ almost surely. Hence large values of $Z_n$ are evidence against $H_0$. At nominal level $\alpha$, Cox's (1961, 1962) test, denoted by $\phi_{COX}$, rejects $H_0$ if $Z_n > z_{1-\alpha}\,\sigma(\hat\theta)$, where $z_\alpha$ is the $N(0,1)$ $\alpha$-quantile.

An obviously desirable property of $\phi_{COX}$ is that the probability of a type I error, $\alpha_I$, converges to the nominal level as $n$ increases. However, no study seems to have been done on the asymptotic behavior of the size of $\phi_{COX}$.

A variant of $\phi_{COX}$, to which it is asymptotically equivalent under the null hypothesis, has been suggested by Atkinson (1970). However, this was shown to be not always consistent by Pereira (1977). Some authors have proposed tests based on statistics other than the likelihood ratio. Epps, Singleton and Pulley (1982) use the empirical moment generating function, and Shen (1982) and Sawyer (1983) derive their tests from information theoretic considerations. None of these tests appears to be superior to the others.

A common feature of the tests is that they all depend on some form of asymptotic approximation, and so require various regularity conditions for their validity. In an attempt to obtain a solution to a problem where such conditions are absent, Williams (1970a, b) considers a different approach which substitutes raw computing power for asymptotics. Given the data, Williams (1970) proposes simulating the distribution of $T_n$ in (1.1) on a computer assuming that $\theta = \hat\theta$. The null hypothesis is rejected at the nominal $\alpha$ level if the observed value of $T_n$ is greater than the $(1-\alpha)$-quantile of the simulated distribution. We will call this procedure $\phi_{PAR}$ since it has been named "parametric bootstrap" in other contexts by some authors (e.g. Efron, 1982).
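
As a rough illustration of the simulation step just described, the following Python sketch applies the parametric bootstrap to the lognormal-versus-exponential problem treated later in Example 4.1. The function names (`fit_lognormal`, `t_stat`, `phi_par`), the default number of replicates and the order-statistic rule used for the quantile are assumptions made for this sketch; they are not prescribed by the procedure itself.

```python
import math
import random


def fit_lognormal(x):
    """Maximum likelihood estimates (mu, sigma) for a lognormal sample."""
    logs = [math.log(v) for v in x]
    mu = sum(logs) / len(logs)
    var = sum((l - mu) ** 2 for l in logs) / len(logs)
    return mu, math.sqrt(var)


def t_stat(x):
    """T_n of (1.1): mean log ratio of the maximized exponential density (H1)
    to the maximized lognormal density (H0)."""
    mu, sigma = fit_lognormal(x)
    b = sum(x) / len(x)  # exponential MLE of the mean
    total = 0.0
    for v in x:
        log_g = -math.log(b) - v / b
        log_f = (-math.log(v * sigma * math.sqrt(2 * math.pi))
                 - (math.log(v) - mu) ** 2 / (2 * sigma ** 2))
        total += log_g - log_f
    return total / len(x)


def phi_par(x, alpha=0.05, reps=1001, rng=None):
    """Simulate T_n under the fitted null (theta = theta-hat) and compare the
    observed T_n with the simulated (1 - alpha)-quantile."""
    rng = rng or random.Random(0)
    mu, sigma = fit_lognormal(x)
    n = len(x)
    sims = sorted(
        t_stat([rng.lognormvariate(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    )
    k = round((1 - alpha) * (reps - 1))  # 0-based index of the order statistic
    crit = sims[k]
    return t_stat(x) > crit, crit
```

A call such as `phi_par(data)` returns the decision together with the simulated critical value, so the latter can be inspected or reused.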

It  seems  pertinent  to  remark  here  that  although  the  idea  of  using 
computer  simulation  to  obtain  the  critical  values  of  a  test  statistic  is 
not  widely  practised  at  the  present  time,  there  are  problems  where  no 
better  alternative  exists.  For  example,  when  the  hypotheses  represent 
location-scale  families  of  distributions,  the  well-known  uniformly  most 
powerful  invariant  test  statistic  is  a  ratio  of  multiple  integrals  whose 
null  distribution,  exact  or  approximate,  is  unknown.  However  it  can  be 
approximated  to  any  desired  degree  of  accuracy  by  Monte  Carlo  simulation 
quite  easily,  especially  as  the  null  distribution  is  independent  of  the 
unknown  parameters. 

One aim in this paper is to compare the finite-sample performance of $\phi_{COX}$ and $\phi_{PAR}$, with emphasis on (i) the extent to which the size of the test, $\alpha_I^s$, exceeds the nominal $\alpha$, and (ii) the power of the tests. It is shown in section 2 that $\alpha_I^s(\phi_{PAR})$ is never less than $\alpha$. In fact, an example is given in section 3 where $\alpha_I^s(\phi_{PAR}) = 1$ for all $\alpha$ and $n$. We also show that $\alpha_I^s(\phi_{COX})$ can be either biased upward or downward.

To reduce the bias in $\alpha_I^s(\phi_{PAR})$, a simple modification, $\phi^*$, is proposed in section 2. It is shown that if under $H_0$, $\hat\theta$ is a consistent estimator and the $1-\alpha$ quantile of $T_n$ is a sufficiently smooth function of $\theta$ for each $n$, then $\alpha_I^s(\phi^*)$ is bounded above by $\alpha + \alpha_n$ for some $\alpha_n > 0$ which tends to zero as $n \to \infty$. No explicit conditions on the limiting distributional behavior of $T_n$ are assumed. Sections 3 and 4 contain examples comparing the different tests.


2. Improving on $\phi_{PAR}$.

Let $T_n$ be given in (1.1). For each $\theta$, $\alpha$ and $n$, define the critical value $c(\theta,\alpha,n)$ such that

    $P_\theta(T_n > c(\theta,\alpha,n)) = \alpha$.   (2.1)

We assume for simplicity here that $T_n$ is a continuous random variable. Let $c_m(\alpha,n) = \sup_\theta c(\theta,\alpha,n)$. If $c_m(\alpha,n)$ is known, the test $\phi_{LRT}$ which rejects $H_0$ if $T_n > c_m(\alpha,n)$ is clearly level $\alpha$. In fact, under mild regularity assumptions, $\phi_{LRT}$ is asymptotically optimal in terms of Bahadur efficiency (see e.g. Bahadur, 1971 or Brown, 1971). This does not, of course, imply that $\phi_{LRT}$ is necessarily most powerful level $\alpha$ for finite $n$.

While $c(\theta,\alpha,n)$ can be approximated given $\theta$, $\alpha$ and $n$, by brute-force computer simulation if necessary, the computation of $c_m(\alpha,n)$ presents a much harder problem. The parametric bootstrap $\phi_{PAR}$ avoids this by having the rejection region $T_n > c(\hat\theta,\alpha,n)$. Notice that since $c(\hat\theta,\alpha,n) \le c_m(\alpha,n)$, $\phi_{PAR}$ is at least as powerful as $\phi_{LRT}$. Our first theorem shows that this is obtained at a cost.

Theorem 2.1. For all $\alpha$ and $n$,

    $\alpha_I^s(\phi_{PAR}) = \sup_\theta P_\theta(T_n > c(\hat\theta,\alpha,n)) \ge \alpha$.

Proof. $\sup_\theta P_\theta(T_n > c(\hat\theta,\alpha,n)) \ge \sup_\theta P_\theta(T_n > c_m(\alpha,n)) = \alpha$.

The bias in the size of $\phi_{PAR}$ will be reduced if we use a critical value for $T_n$ that is greater than $c(\hat\theta,\alpha,n)$. Assuming that $\hat\theta$ is a consistent estimator of $\theta$, this can be done in the following way. Let $I_n(\hat\theta)$ be a $100(1-\alpha_n)\%$ confidence interval for $\theta$ such that both its length and $\alpha_n$ tend to zero as $n$ tends to infinity. The idea behind the test we will propose is to use $c^* = c(\theta^*,\alpha,n)$ as the critical value of $T_n$, where $\theta^*$ maximizes $c(\theta,\alpha,n)$ over $I_n(\hat\theta)$. Of course, this method is practicable only if $\theta^*$ is known a priori for all $I_n(\hat\theta)$. Section 3 contains examples where this is the case. For other cases, $\theta^*$ can at best be approximated with an "interval of uncertainty" $(\theta_1^*,\theta_2^*)$, which will contain $\theta^*$ if it is assumed that $c(\theta,\alpha,n)$ is unimodal in $I_n(\hat\theta)$. Making this assumption, the standard optimum-seeking methods like the Fibonacci and golden-section search can be used. Given $\epsilon' > 0$, each of these two methods is known to produce an interval of uncertainty, of length less than $\epsilon'$, with a minimum number of function evaluations; see e.g. Wilde (1964).
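
The golden-section search mentioned above can be sketched in a few lines of Python. This is a generic textbook version for maximizing a unimodal function on an interval; the function name `golden_section_max` and its arguments are assumptions of this sketch, and in the present setting `f` would stand for the map from $\theta$ to $c(\theta,\alpha,n)$.

```python
import math


def golden_section_max(f, a, b, eps):
    """Shrink [a, b] around the maximizer of a unimodal f until the
    interval of uncertainty is shorter than eps; return that interval."""
    invphi = (math.sqrt(5) - 1) / 2          # 1 / golden ratio
    x1 = b - invphi * (b - a)
    x2 = a + invphi * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a >= eps:
        if f1 < f2:
            # The maximum lies in (x1, b): drop the left part.
            a, x1, f1 = x1, x2, f2
            x2 = a + invphi * (b - a)
            f2 = f(x2)
        else:
            # The maximum lies in (a, x2): drop the right part.
            b, x2, f2 = x2, x1, f1
            x1 = b - invphi * (b - a)
            f1 = f(x1)
    return a, b
```

Each pass reuses one of the two previous function values, so only one new evaluation of `f` is needed per iteration; this matters when every evaluation is itself a Monte Carlo run.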

To approximate $c(\theta^*,\alpha,n)$, let $I_n(\hat\theta) = (\hat\theta_1,\hat\theta_2)$ and $\epsilon = \theta_2^* - \theta_1^*$. Suppose first that $\hat\theta_1 < \theta_1^* < \theta_2^* < \hat\theta_2$. Then we may assume without loss of generality that $\hat\theta_1 < \theta_1^* - \epsilon < \theta_2^* + \epsilon < \hat\theta_2$, because if this were not true, we could reduce the interval of uncertainty, and hence $\epsilon$, by continuing the golden-section search. Let $m_1 = c(\theta_1^*,\alpha,n) - c(\theta_1^* - \epsilon,\alpha,n)$ and $m_2 = c(\theta_2^* + \epsilon,\alpha,n) - c(\theta_2^*,\alpha,n)$, and define

    $c^*(\hat\theta,\alpha,n) = \max_{i=1,2} \{c(\theta_i^*,\alpha,n) + |m_i|\}$.   (2.2)

If $\theta_i^* = \hat\theta_i$ for some $i = 1,2$, we set $m_i = 0$. Denote by $\phi^*$ the test which rejects the null hypothesis if $T_n > c^*(\hat\theta,\alpha,n)$. If $c(\theta,\alpha,n)$ is sufficiently smooth, an upper bound on the size of $\phi^*$ can be obtained.
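
The critical value in (2.2) is simple to compute once an interval of uncertainty is available. The sketch below is one possible transcription; the function name `c_star` and its argument names are assumptions, and `c` is assumed to be a callable returning $c(\theta,\alpha,n)$ for fixed $\alpha$ and $n$.

```python
def c_star(c, theta1_star, theta2_star, theta1_hat, theta2_hat):
    """Critical value (2.2) from the interval of uncertainty
    (theta1_star, theta2_star) inside I_n = (theta1_hat, theta2_hat)."""
    eps = theta2_star - theta1_star
    # m_i is set to zero when the interval of uncertainty touches an
    # endpoint of the confidence interval.
    if theta1_star == theta1_hat:
        m1 = 0.0
    else:
        m1 = c(theta1_star) - c(theta1_star - eps)
    if theta2_star == theta2_hat:
        m2 = 0.0
    else:
        m2 = c(theta2_star + eps) - c(theta2_star)
    return max(c(theta1_star) + abs(m1), c(theta2_star) + abs(m2))
```

For a concave toy curve such as $c(\theta) = -\theta^2$ with interval of uncertainty $(-0.1, 0.1)$, the value returned is about 0.07, which lies above the true maximum 0, as the construction intends.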
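Theorem 2.2.

(i). $\alpha_I^s(\phi^*) \le \alpha_I^s(\phi_{PAR})$.

(ii). Suppose that (a) for each local maximum $\tilde\theta$ of $c(\theta,\alpha,n)$ there is $\delta = \delta(\tilde\theta) > 0$ such that $c(\theta,\alpha,n)$ is concave for $\theta \in (\tilde\theta - 2\delta, \tilde\theta + 2\delta)$ and, (b) with $H_0$-probability one, $I_n(\hat\theta)$ contains at most one local maximum of $c(\theta,\alpha,n)$. Then $\alpha_I(\phi^*) \le \alpha + \alpha_n$ if, whenever $\theta$ obtains and $\tilde\theta \in I_n(\hat\theta)$, we choose $\epsilon$ in (2.2) so small that $\epsilon < \delta(\tilde\theta)$.

Proof. (i). Obvious.

(ii). The assumptions imply that for every $\theta$,

    $P_\theta(\phi^* \text{ rejects } H_0) \le P_\theta(T_n > c(\theta,\alpha,n),\ \theta \in I_n(\hat\theta)) + P_\theta(\theta \notin I_n(\hat\theta)) \le \alpha + \alpha_n$.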

Both assumptions in part (ii) of the theorem set conditions on the smoothness of $c(\theta,\alpha,n)$. Condition (b) is necessary to ensure that the search does not yield the 'wrong' local maximum. Although the local maxima of $c(\theta,\alpha,n)$ are seldom known in advance, the sequential nature of the search allows the experimenter to plot the points $\{(\theta^{(i)}, c(\theta^{(i)},\alpha,n))\}$ at each stage and decide for himself whether the length $\epsilon$ of the current interval of uncertainty is small enough for stopping. Stopping at any stage amounts to making assumption (a) of the theorem with $\delta(\tilde\theta) > \epsilon$, if there exists a local maximum $\tilde\theta$ in the observed $I_n(\hat\theta)$.

In the above discussion it is assumed that $c(\theta,\alpha,n)$ can be obtained exactly for any $\theta$ selected by the search procedure. When $c(\theta,\alpha,n)$ has to be estimated by computer simulation, its value will be subject to Monte Carlo error. However, since this error can be made arbitrarily small by increasing the number of Monte Carlo replicates, we assume it to be negligible here.


3. Some examples permitting analytic solution.

In this section the superiority of $\phi^*$ is demonstrated in some classical testing problems where analytic solutions are possible.

Example 3.1. Testing a normal mean.

Let $(X_1,\ldots,X_n)$ be a random sample from $N(\mu,\sigma^2)$, $-\infty < \mu < \infty$, $\sigma > 0$, and consider testing $H_0 : N(0,\sigma^2)$ vs. $H_1 : N(\mu,1)$. It is easily seen from (1.1) that

    $T_n = .5(1 + \log s^2 - s^2 + \bar{x}^2)$

where $\bar{x}$ is the sample mean and $s^2$ the maximum likelihood estimate of $\sigma^2$. Thus $\phi_{PAR}$ has the rejection region

    $.5(1 + \log s^2 - s^2 + \bar{x}^2) > k_{1-\alpha}(s)$

where $k_\alpha(\sigma)$ is the $\alpha$ quantile of $T_n$ when $\sigma$ obtains. We can avoid the evaluation of $k_{1-\alpha}(s)$ by rewriting the rejection region as $\bar{x}^2/s^2 > k'$. Now $k'$, being the $1-\alpha$ quantile of $\bar{x}^2/s^2$, is independent of $s$. So both $\phi_{PAR}$ and $\phi^*$ are equivalent to the t-test, which is uniformly most powerful unbiased. It turns out that $\phi_{COX}$ does not exist for this problem because $T_n - E_{\hat\sigma}T_n$ is of order $n^{-1}$ and hence the LHS of (1.2) converges to 0 in probability under $H_0$.

Example 3.2. Testing a normal variance.

Let $(X_1,\ldots,X_n)$ be a random sample from $N(\mu,\sigma^2)$, $-\infty < \mu < \infty$, $\sigma > 0$. We consider testing the following hypotheses.

(a) $H_0 : \sigma^2 = 1$ vs. $H_1 : \sigma^2 = 2$.

It can be verified that (1.1) gives $T_n = s^2/4 + \text{constant}$, where $s^2$ is the maximum likelihood estimate of $\sigma^2$. Since the distribution of $T_n$ is independent of $\mu$, $\phi_{PAR}$ and $\phi^*$ are identical, and yield the uniformly most powerful test.

An easy calculation shows that at the nominal $\alpha$ level, $\phi_{COX}$ rejects $H_0$ if

    $ns^2 > n - 1 + z_{1-\alpha}\{2(n-1)\}^{1/2}$.   (3.1)

The RHS is precisely the two-term Cornish-Fisher expansion for the $(1-\alpha)$-quantile $\chi^2_{n-1;1-\alpha}$ of the $\chi^2_{n-1}$ distribution. Table 3.1 gives some numerical values of the size of the test, $\alpha_I^s(\phi_{COX})$, for different values of $\alpha$ and $n$. Numbers in parentheses give the ratio $\alpha_I^s/\alpha$. The entries indicate that $\alpha_I^s(\phi_{COX})$ is biased upward.
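
The size implied by (3.1) can be checked numerically, since $ns^2$ has a $\chi^2_{n-1}$ distribution under $H_0$. The following sketch uses only the Python standard library; the helper names `chi2_sf` and `cox_size` are assumptions, and the chi-square tail is computed from the usual power series for the incomplete gamma function rather than from any method stated in the paper.

```python
import math
from statistics import NormalDist


def chi2_sf(x, df):
    """P(chi-square with df degrees of freedom > x), via the series for the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= y / (a + k)
        total += term
        k += 1
    lower = total * math.exp(a * math.log(y) - y - math.lgamma(a))
    return 1.0 - lower


def cox_size(alpha, n):
    """Size of the test with rejection region (3.1) for H0: sigma^2 = 1."""
    z = NormalDist().inv_cdf(1 - alpha)
    crit = (n - 1) + z * math.sqrt(2 * (n - 1))   # RHS of (3.1)
    return chi2_sf(crit, n - 1)                   # n s^2 ~ chi-square(n-1)
```

Evaluating `cox_size(0.05, 5)` gives roughly .070, which agrees with the value reported for the nominal level .05 in Table 3.1.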


Table 3.1. Values of $\alpha_I^s$ and $\alpha_I^s/\alpha$ for $\phi_{COX}$ ($H_0 : \sigma^2 = 1$ vs. $H_1 : \sigma^2 = 2$)

    alpha    n = 5         n = 10        n = 20        n = 100
    .05      .070 (1.4)    .067 (1.34)   .064 (1.28)   .057 (1.14)
    .01      .032 (3.2)    .026 (2.6)    .022 (2.2)    .016 (1.6)
    .005     .024 (4.8)    .018 (3.6)    .014 (2.8)    .009 (1.8)

(b) $H_0 : \sigma^2 = 1$ vs. $H_1 : \sigma^2 = 1/2$.

As in (a), $\phi_{PAR}$ and $\phi^*$ are both equivalent to the uniformly most powerful invariant test which rejects for small $s^2$. The rejection region for $\phi_{COX}$ is

    $ns^2 < n - 1 - z_{1-\alpha}\{2(n-1)\}^{1/2}$.   (3.2)

Table 3.2 shows that $\alpha_I^s(\phi_{COX})$ is now biased downward; numbers in parentheses again give $\alpha_I^s/\alpha$. The many zeros in the table arise because the RHS of (3.2) is negative or close to 0 for those values of $\alpha$ and $n$.


Table 3.2. Values of $\alpha_I^s$ and $\alpha_I^s/\alpha$ for $\phi_{COX}$ ($H_0 : \sigma^2 = 1$ vs. $H_1 : \sigma^2 = .5$)

    alpha    n = 5     n = 10        n = 20        n = 100
    .05      0 (0)     .009 (.18)    .024 (.48)    .040 (.8)
    .01      0 (0)     0 (0)         0 (0)         .005 (.5)
    .005     0 (0)     0 (0)         0 (0)         .002 (.4)

(c) $H_0 : \sigma \le 1$ vs. $H_1 : \sigma > 1$.

Let $\hat\sigma_i^2$ be the maximum likelihood estimate of $\sigma^2$ under $H_i$, $i = 0,1$. Then $\hat\sigma_0^2 = \min(s^2,1)$, $\hat\sigma_1^2 = \max(s^2,1)$ and

    $T_n = \{.5(s^2-1) - \log s\}\,\mathrm{sgn}(s^2-1)$,

the latter being an increasing function of $s^2$. Hence $\phi_{PAR}$ rejects $H_0$ at nominal level $\alpha$ if $ns^2 > \hat\sigma_0^2\,\chi^2_{n-1,1-\alpha}$, which reduces to

    $ns^2 > \chi^2_{n-1,1-\alpha}$   if $s^2 > 1$,
    $n > \chi^2_{n-1,1-\alpha}$      if $s^2 \le 1$.

Clearly for $\alpha$-values satisfying $\chi^2_{n-1,1-\alpha} > n$, $\phi_{PAR}$ yields the uniformly most powerful level $\alpha$ test. This will be the case for the levels used in practice. Otherwise, $\phi_{PAR}$ rejects $H_0$ regardless of the data. Therefore

    $\alpha_I^s(\phi_{PAR}) = \alpha$ if $\chi^2_{n-1,1-\alpha} > n$, and $\alpha_I^s(\phi_{PAR}) = 1$ otherwise.

Since for fixed $1/2 < \alpha < 1$,

    $\nu - \chi^2_{\nu,1-\alpha} = O(\nu^{1/2})$ as $\nu \to \infty$,   (3.3)

we conclude that $\phi_{PAR}$ is not even asymptotically level $\alpha$ for $\alpha \in (.5,1)$.


To derive $\phi^*$, let the interval with endpoints $s^2\exp(\pm n^{-1/2}k_n)$ be a confidence interval for $\sigma^2$ under $H_0$ such that $k_n \to \infty$ and $n^{-1/2}k_n \to 0$ as $n \to \infty$. The rejection region for $\phi^*$ then has the form $ns^2 > \tilde\sigma_0^2\,\chi^2_{n-1,1-\alpha}$, where $\tilde\sigma_0^2 = \min(s^2\exp(n^{-1/2}k_n), 1)$. Hence $\phi^*$ rejects $H_0$ if

    $\{ns^2 > \chi^2_{n-1,1-\alpha}$ and $s^2\exp(n^{-1/2}k_n) > 1\}$

or

    $\{n\exp(-n^{-1/2}k_n) > \chi^2_{n-1,1-\alpha}$ and $s^2\exp(n^{-1/2}k_n) \le 1\}$.

Clearly, at the usual levels of $\alpha$ ($< 1/2$), $\phi^*$ is also uniformly most powerful level $\alpha$ for sufficiently large $n$. Unlike $\phi_{PAR}$, however, (3.3) and the conditions on $k_n$ imply that $\alpha_I^s(\phi^*) \to \alpha$ as $n \to \infty$ for all $0 < \alpha < 1$.

It is noted that $\phi_{COX}$ is not valid in the present situation because the LHS of (1.2) does not converge to a normal distribution. This is partly due to the fact that $n^{1/2}(\hat\sigma_0^2 - 1)$ is not asymptotically normal (cf. White, 1982).

Similar results to (c) carry over to the problem of testing $H_0 : N(\theta,1)$, $\theta \le \theta_0$ vs. $H_1 : N(\theta,1)$, $\theta > \theta_0$. The next example shows $\phi_{PAR}$ at its worst.

Example 3.3. Testing the location of an exponential distribution.


Let $(X_1,\ldots,X_n)$ be a random sample from the exponential distribution with density $\exp(\theta - x)$, $x > \theta$, and consider testing $H_0 : \theta \ge 0$ vs. $H_1 : \theta < 0$. The maximum likelihood estimators of $\theta$ under $H_0$ and $H_1$ are $\hat\theta_0 = X_{(1)}^+$, $\hat\theta_1 = X_{(1)}^-$ respectively, where $X_{(1)}$ is the smallest order statistic and $x^- = \min(x,0)$, $x^+ = \max(x,0)$. An easy calculation yields

    $T_n = -X_{(1)}$   if $X_{(1)} \ge 0$,
    $T_n = \infty$     if $X_{(1)} < 0$.

Since $P_\theta(X_{(1)} < \theta - n^{-1}\log(1-\alpha)) = \alpha$, $\phi_{PAR}$ rejects $H_0$ if $X_{(1)} < 0$ or $X_{(1)} < \hat\theta_0 - n^{-1}\log(1-\alpha)$. Substitution for $\hat\theta_0$ shows that for all values of $X_{(1)}$ at least one of these two inequalities is satisfied. Hence $\phi_{PAR}$ rejects $H_0$ with probability one for all $\alpha$ and $n$.

To derive $\phi^*$, we use the fact that under $H_0$, $(X_{(1)} + n^{-1}\log\alpha_n,\ X_{(1)}]$ is a $100(1-\alpha_n)\%$ confidence interval for $\theta$, where $\alpha_n \to 0$. Using this, $\phi^*$ has the rejection region

    $X_{(1)} < 0$ or $X_{(1)} < \{X_{(1)} + n^{-1}\log\alpha_n\}^+ - n^{-1}\log(1-\alpha)$;

this reduces to

    $X_{(1)} < \min\{-n^{-1}\log(1-\alpha),\ -n^{-1}\log\alpha_n\}$

or

    $\{X_{(1)} > -n^{-1}\log\alpha_n$ and $-n^{-1}\log\alpha_n < -n^{-1}\log(1-\alpha)\}$.

Let $n_\alpha$ be the greatest integer $n$ satisfying $-n^{-1}\log\alpha_n < -n^{-1}\log(1-\alpha)$. If $n \le n_\alpha$, $\phi^*$ rejects $H_0$ with probability one regardless of the data. On the other hand, if $n > n_\alpha$, $\phi^*$ rejects $H_0$ whenever $X_{(1)} < -n^{-1}\log(1-\alpha)$. This coincides with the rejection region of the uniformly most powerful level $\alpha$ test. Therefore we have $\alpha_I^s(\phi^*) = \alpha$ for all $n > n_\alpha$ and all $\alpha$. As in part (c) of the earlier example, $\phi_{COX}$ is inapplicable here.
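
The two rejection rules of Example 3.3 depend on the data only through the smallest order statistic, so they are easy to compare directly. The sketch below is a plain transcription under that reading; the function names `par_rejects` and `star_rejects`, and the argument `alpha_n` for the non-coverage level of the confidence interval, are assumptions of this illustration.

```python
import math


def par_rejects(x1, n, alpha):
    """Parametric bootstrap rule: plug the H0 estimate of theta into the
    critical value for the smallest order statistic x1."""
    theta0_hat = max(x1, 0.0)
    return x1 < 0 or x1 < theta0_hat - math.log(1 - alpha) / n


def star_rejects(x1, n, alpha, alpha_n):
    """Modified rule: use the least favorable theta >= 0 inside the
    confidence interval (x1 + log(alpha_n)/n, x1]."""
    lower = max(x1 + math.log(alpha_n) / n, 0.0)
    return x1 < 0 or x1 < lower - math.log(1 - alpha) / n
```

With `n = 20` and `alpha = 0.05`, `par_rejects` returns `True` for every value of `x1` tried, whereas `star_rejects` with `alpha_n = 0.01` rejects only when `x1` falls below `-log(0.95)/20`, the cutoff of the uniformly most powerful test.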
4. Three examples considered in Cox (1962).

We consider in this section three examples in Cox (1962) where analytic solutions for the level and power of the tests are not available. The comparisons are therefore based on Monte Carlo simulation. Only the nominal level of $\alpha = .05$ is investigated.

Although the procedure $\phi^*$ as described in section 2 is computationally quite easy to apply on any one data set, a Monte Carlo evaluation of its performance using, say, $10^4$ simulated data sets can be a time consuming task. To reduce this effort, the following modification of $\phi^*$ is adopted here. In the first stage, after deciding on the parameter-ranges of interest, a grid of between 20 and 30 $\theta$-values is selected. For each selected $\theta$, the critical value $c'(\theta,\alpha,n)$ of $n^{1/2}T_n$ is approximated on a computer by simulating 10,001 values of $n^{1/2}T_n$ and setting $c'(\theta,\alpha,n)$ to be the 9,501st ordered value. The points $\{(\theta, c'(\theta,\alpha,n))\}$ are then smoothly interpolated with a cubic spline to yield an approximation $c_s(\theta,\alpha,n)$ to $c'(\theta,\alpha,n)$. This curve is stored for use in the second stage, where for each desired member of the null or alternative hypotheses, $10^4$ sets of pseudo-random samples of size $n$ are simulated. For each set, the values of $n^{1/2}T_n$, $\hat\theta$ and the predetermined confidence interval $I_n(\hat\theta)$ are computed. Finally the test is said to reject the null hypothesis for that data set if

    $n^{1/2}T_n > c_s^*(\hat\theta,\alpha,n) = \max\{c_s(\theta,\alpha,n) : \theta \in I_n(\hat\theta)\}$.

Note that because $c_s$ is a cubic spline, this maximization is quite trivial. With this modification, the computer evaluation of $\phi^*$ can be done very quickly.
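
The two-stage scheme above can be sketched as follows. To keep the example dependency-free, a piecewise-linear interpolant is swapped in for the cubic spline, so this is an approximation of the idea rather than the procedure actually used; the names `critical_value_grid`, `interp` and `c_star_s`, and the callable `simulate_stat(theta, rng)` that returns one simulated value of $n^{1/2}T_n$ at a given $\theta$, are assumptions of the sketch.

```python
import random
from bisect import bisect_left


def critical_value_grid(simulate_stat, grid, alpha=0.05, reps=2001, rng=None):
    """Stage 1: for each theta on the grid, simulate the statistic and keep
    the order statistic that plays the role of the 9,501st of 10,001."""
    rng = rng or random.Random(0)
    k = round((1 - alpha) * (reps - 1))   # 0-based index
    curve = []
    for theta in grid:
        sims = sorted(simulate_stat(theta, rng) for _ in range(reps))
        curve.append(sims[k])
    return curve


def interp(grid, curve, theta):
    """Piecewise-linear stand-in for c_s (linear extrapolation off the grid)."""
    j = min(max(bisect_left(grid, theta), 1), len(grid) - 1)
    w = (theta - grid[j - 1]) / (grid[j] - grid[j - 1])
    return (1 - w) * curve[j - 1] + w * curve[j]


def c_star_s(grid, curve, lo, hi):
    """Stage 2: maximum of the interpolated curve over [lo, hi]. For a
    piecewise-linear curve it is attained at an endpoint or at a knot."""
    candidates = [interp(grid, curve, lo), interp(grid, curve, hi)]
    candidates += [c for g, c in zip(grid, curve) if lo < g < hi]
    return max(candidates)
```

The stored `grid` and `curve` are computed once and reused for every simulated data set, which is what makes the second stage cheap.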

To achieve maximum correlation in the results, the same simulated data sets were used to assess $\phi_{COX}$. The standard errors in the resulting probabilities of rejection are roughly about .002.

Because $\phi_{PAR}$ does not permit a similar modification, its

evaluation  is  included  only  in  Example  4.1.  There,  for  each  of  10^ 
sets  of  pseudo- random  samples,  201  bootstrap  samples  were  simulated  to 

obtain $c'(\hat\theta,\alpha,n)$. The standard error of the results for $\phi_{PAR}$ is therefore at least .008.

All the computations were done on a VAX 11/750 computer. Pseudo-random numbers were generated via the International Mathematical and Statistical Library, and the FORTRAN program in Forsythe, Malcolm and Moler (1977, Chap. 4) was used to fit cubic splines.

Example 4.1. Lognormal versus Exponential.

Let $(X_1,\ldots,X_n)$ be a random sample from a distribution with density $h(x)$ and consider testing

    $H_0 : h(x) = f(x,\mu,\sigma)$ vs. $H_1 : h(x) = g(x,b)$

where $x > 0$, and

    $f(x,\mu,\sigma) = \{x\sigma(2\pi)^{1/2}\}^{-1}\exp\{-(\log x - \mu)^2/(2\sigma^2)\}$
    $g(x,b) = b^{-1}\exp(-x/b)$.

Jackson (1968), Atkinson (1970) and Epps, Singleton and Pulley (1982) have also considered this problem. We will use $\phi_{ESP}$ to denote the latter test in the sequel. From Cox (1962), the nominal level $\alpha$ rejection region for $\phi_{COX}$ is

    $\hat\mu - \log\bar{X} + \hat\sigma^2/2 > n^{-1/2} z_{1-\alpha}\{\exp(\hat\sigma^2) - 1 - \hat\sigma^2 - \hat\sigma^4/2\}^{1/2}$   (4.1)

where $\hat\sigma^2 = n^{-1}\sum(\log X_i - \hat\mu)^2$ and $\hat\mu = n^{-1}\sum\log X_i$. Also,

    $T_n = \log\hat\sigma + \hat\mu - \log\bar{X} + (\log(2\pi) - 1)/2$.
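
For concreteness, the rejection rule (4.1) can be coded directly. The sketch below uses only the Python standard library; the function names `cox_lhs` and `cox_rejects` are assumptions, and the `max(..., 0.0)` guard against rounding error in the variance term is an addition of this sketch.

```python
import math
from statistics import NormalDist


def _lognormal_mle(x):
    """Return (mu-hat, sigma-hat squared) for a positive sample x."""
    logs = [math.log(v) for v in x]
    mu = sum(logs) / len(logs)
    s2 = sum((l - mu) ** 2 for l in logs) / len(logs)
    return mu, s2


def cox_lhs(x):
    """Left-hand side of (4.1)."""
    mu, s2 = _lognormal_mle(x)
    return mu - math.log(sum(x) / len(x)) + s2 / 2


def cox_rejects(x, alpha=0.05):
    """Cox's test of lognormal (H0) against exponential (H1) via (4.1)."""
    n = len(x)
    _, s2 = _lognormal_mle(x)
    # exp(v) - 1 - v - v^2/2 is nonnegative for v >= 0; guard tiny negatives.
    v = max(math.exp(s2) - 1 - s2 - s2 ** 2 / 2, 0.0)
    rhs = NormalDist().inv_cdf(1 - alpha) * math.sqrt(v / n)
    return cox_lhs(x) > rhs
```

For the three-point sample $(1, e, e^2)$ the left-hand side is about 0.024 and the right-hand side about 0.23, so the null hypothesis is not rejected at the .05 level.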

n 

To understand the relative merits of $\phi_{COX}$ and $\phi^*$, the following remarks may be helpful. First note that the problem is invariant under scalar multiplication and both $\phi_{COX}$ and $T_n$ are scale invariant. Therefore the problem can be reduced by restricting to scale invariant tests. It can be verified that for each $\sigma_0$, the uniformly most powerful invariant test of the smaller null hypothesis $H_0 : h(x) = f(x,\mu,\sigma_0)$ vs. $H_1$ rejects $H_0$ if

    $S_n(\sigma_0) = n^{-1}(n-1)\log\sigma_0 + \hat\mu + \hat\sigma^2/(2\sigma_0^2) - \log\bar{X}$

is too large. Since the LHS of (4.1) is equal to $S_n(1)$, we see that $\phi_{COX}$ is an approximation to the uniformly most powerful invariant test for $\sigma_0 = 1$. The value $\sigma = 1$ is in some sense least favorable because under $H_0$,

    $T_n \to \log\sigma - \sigma^2/2 + \text{constant}$, as $n \to \infty$,

and the limit is maximized at $\sigma = 1$. On the other hand, $T_n - S_n(\hat\sigma) \to .5\log(2\pi) - 1$ as $n \to \infty$. Therefore, since $\phi_{PAR}$ and $\phi^*$ reject for large values of $T_n$, they also approximate the uniformly most powerful invariant test, but by first estimating the unknown $\sigma_0$ with $\hat\sigma$. These observations suggest that $\phi_{COX}$, $\phi_{PAR}$ and $\phi^*$ would all be reasonable for large $n$.

We investigated the performance of these tests by Monte Carlo simulation for $n = 20$. Table 4.1 presents the results for the probability of a type I error, $\alpha_I$. The figures for $\phi_{ESP}$ are quoted from Epps, Singleton and Pulley (1982). Two spline-fitted curves $c_s$ are used, one to obtain the first two rows of Table 4.1, and another for the remaining rows. The first curve is based on a grid of twenty-five equally spaced values of $\log\sigma$ centered at $\log(.007)$. The second curve, shown in Figure 4.1, uses a grid of thirty-one equally spaced $\log\sigma$ values centered at 0. The spacing of the grid points for both curves is $(2n)^{-1/2}\log\log n$. In the computations for $\phi^*$, we took the interval with endpoints $\log\hat\sigma \pm (2/n)^{1/2}\log\log n$ as the confidence interval $I_n(\hat\theta)$ for $\log\sigma$. Table 4.2 shows the powers of the four tests. The data indicate that besides having very low power, $\phi_{COX}$ has size in excess of 0.4. $\phi^*$ appears to control the significance level quite well, with only slight loss of power compared to $\phi_{PAR}$.


Table 4.1. Prob (Type I error) for $H_0$ : lognormal vs. $H_1$ : exponential


Tables 4.3 and 4.4 show the corresponding results with the roles of the hypotheses interchanged. Now $\phi_{PAR}$ and $\phi^*$ are the same tests because the distribution of $T_n$ is invariant over $H_0$ : exponential. The powers of $\phi_{COX}$ and $\phi^*$ appear comparable.


Table 4.3. Prob (Type I error) for $H_0$ : exponential vs. $H_1$ : lognormal, $\alpha = .05$, $n = 20$.


.0587  .105 


.0517 


From  Epps,  Singleton  and  Pulley  (1982). 


Table 4.4. Power of tests for $H_0$ : exponential vs. $H_1$ : lognormal, $\alpha = .05$, $n = 20$.


0 

*cox 

♦  f 
'•’ESP 

^PAR'^* 

,5 

.9999 

.9999 

.0 

.4014 

.22 

.3713 

,414 

.5482 

.60 

.5297 

,0 

.8951 

■  04 

•  1  a. 

.8885 

Let $(X_1,\ldots,X_n)$ be a random sample from a distribution with probability function $h(x)$. We wish to test $H_0 : h(x) = \lambda^x\exp(-\lambda)/x!$ vs. $H_1 : h(x) = \theta^x/(1+\theta)^{x+1}$, where $x = 0,1,2,\ldots$ in either case. The maximum likelihood estimators $\hat\lambda$ and $\hat\theta$ are both $\bar{X}$ and

    $T_n = n^{-1}\sum\log X_i! + \bar{X} - (1+\bar{X})\log(1+\bar{X})$.

Cox (1962) showed that $\phi_{COX}$ has the rejection region

    $\sum\log X_i! - n\,l_f(\bar{X}) > z_{1-\alpha}\{n\,v_f(\bar{X})\}^{1/2}$

where $l_f$ and $v_f$ are functions defined therein. A short table of values of these as well as other needed functions is given in Cox (1962). We obtained other values by spline interpolation.
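
The closed form of $T_n$ for the Poisson-versus-geometric problem can be checked against the definition (1.1) with a short Python sketch. The function names are assumptions of this illustration, and `math.lgamma(v + 1)` is used for $\log v!$.

```python
import math


def t_poisson_vs_geometric(x):
    """Closed form of T_n for H0: Poisson vs. H1: geometric."""
    n = len(x)
    xbar = sum(x) / n
    mean_log_fact = sum(math.lgamma(v + 1) for v in x) / n
    return mean_log_fact + xbar - (1 + xbar) * math.log(1 + xbar)


def t_from_definition(x):
    """The same statistic computed term by term from (1.1); needs xbar > 0."""
    n = len(x)
    xbar = sum(x) / n   # MLE of both lambda and theta
    total = 0.0
    for v in x:
        log_g = v * math.log(xbar) - (v + 1) * math.log(1 + xbar)
        log_f = v * math.log(xbar) - xbar - math.lgamma(v + 1)
        total += log_g - log_f
    return total / n
```

The two functions agree up to rounding error on any sample with positive mean, which gives a quick sanity check on the algebra.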

With $n = 20$, a grid of 20 $\lambda^{1/2}$-values was used to construct the spline-smoothed curve $c_s$. The confidence interval

    $I_n(\hat\lambda) = (\hat\lambda^{1/2} - n^{-1/2}\log\log n,\ \hat\lambda^{1/2} + n^{-1/2}\log\log n) \cap (0,\infty)$

for $\lambda^{1/2}$ was used in the computations for $\phi^*$. Table 4.5 presents the results. Corresponding results with the hypotheses interchanged are shown in Table 4.6. Here the confidence interval for $\theta^{1/2}$ used was

    $I_n(\hat\theta) = (\hat\theta^{1/2} - n^{-1/2}\log\log n\,(1+\hat\theta)^{1/2},\ \hat\theta^{1/2} + n^{-1/2}\log\log n\,(1+\hat\theta)^{1/2}) \cap (0,\infty)$.

It is clear from the tables that, for the parameter values considered, $\phi_{COX}$ and $\phi^*$ are practically equal in performance. For values of $\lambda$ and $\theta$ closer to 0, however, the discrete nature of $T_n$ will progressively cause $\phi_{COX}$ and $\phi^*$ to have arbitrarily low power, and some sort of randomization will be necessary.


Table 4.5. Prob (Type I error) and power for $H_0$ : Poisson vs. $H_1$ : Geometric, $\alpha = .05$, $n = 20$.

               Prob (Type I error)                    Power
    $\lambda$  $\phi_{COX}$   $\phi^*$     $\theta$   $\phi_{COX}$   $\phi^*$
    .30        .0527          .0309        .30        .199           .156
    .45        .0488          .0452        .45        .272           .261
    .60        .0478          .0475        .60        .362           .360
    .75        .0431          .0427        .75        .453           .450
    .90        .0427          .0398        .90        .543           .533

Table 4.6. Prob (Type I error) and power for $H_0$ : Geometric vs. $H_1$ : Poisson, $\alpha = .05$, $n = 20$.

               Prob (Type I error)                    Power
    $\theta$   $\phi_{COX}$   $\phi^*$     $\lambda$  $\phi_{COX}$   $\phi^*$
    .30        .0039          .0040        .30        .0185          .0190
    .45        .0111          .0137        .45        .0800          .0895
    .60        .0182          .0250        .60        .165           .201
    .75        .0200          .0315        .75        .254           .321
    .90        .0241          .0397        .90        .351           .435

Example 4.3. Quantal response.

Let $(X_1,\ldots,X_k)$ be independently binomially distributed with indices $n_1,\ldots,n_k$ and parameters $f_1(\gamma),\ldots,f_k(\gamma)$ under $H_0$ and $g_1(\beta),\ldots,g_k(\beta)$ under $H_1$, where $f_j(\gamma) = 1 - \exp(-\gamma x_j)$ and $g_j(\beta) = 1 - \exp(-\beta x_j) - \beta x_j\exp(-\beta x_j)$ for a set of "dose levels" $x_1,\ldots,x_k$. $H_0$ and $H_1$ have been called the "one-hit" and "two-hit" hypotheses




5. Concluding remarks.

We have proposed here a test of separate families of hypotheses which requires very different assumptions from those for tests based on asymptotic normality of the test statistics, and showed in a series of examples that it is quite reasonable. The following points however should be mentioned. First, it is obvious that the results in section 2 remain true if any consistent estimator $\hat\theta$ is used instead of the maximum likelihood estimator. Further, although these results do not require conditions on the estimator $\hat\omega$ in (1.1), it is intuitively plausible that if high power is to be achieved for $\phi^*$, $\hat\omega$ should at least be consistent. This condition is satisfied in all the examples.

It will be noticed that in the examples, we always have the critical value $c(\theta,\alpha,n)$ such that it is either independent of $\theta$ or a function of a one-dimensional component of $\theta$. In situations where this is not so, the practical implementation of $\phi^*$ can be difficult, since we have to search for the maximum of a function in high dimensional space. In contrast, $\phi_{COX}$ does not have the same computational problem. However, as we saw in Example 4.1, the size of $\phi_{COX}$ may not be close to its nominal level, and this phenomenon may worsen in higher dimensions.